Time-based reward maximization

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Time-based reward maximization.

Humans and animals time intervals from seconds to minutes with high accuracy but limited precision. Consequently, time-based decisions are inevitably subjected to our endogenous timing uncertainty, and thus require temporal risk assessment. In this study, we tested temporal risk assessment ability of humans when participants had to withhold each subsequent response for a minimum duration to ear...

متن کامل

Eye Movements for Reward Maximization

Recent eye tracking studies in natural tasks suggest that there is a tight link between eye movements and goal directed motor actions. However, most existing models of human eye movements provide a bottom up account that relates visual attention to attributes of the visual scene. The purpose of this paper is to introduce a new model of human eye movements that directly ties eye movements to the...

متن کامل

Reward Maximization Under Uncertainty: Leveraging Side-Observations Reward Maximization Under Uncertainty: Leveraging Side-Observations on Networks

We study the stochastic multi-armed bandit (MAB) problem in the presence of sideobservations across actions that occur as a result of an underlying network structure. In our model, a bipartite graph captures the relationship between actions and a common set of unknowns such that choosing an action reveals observations for the unknowns that it is connected to. This models a common scenario in on...

متن کامل

Expectation Maximization for Average Reward Decentralized POMDPs

Planning for multiple agents under uncertainty is often based on decentralized partially observable Markov decision processes (DecPOMDPs), but current methods must de-emphasize long-term effects of actions by a discount factor. In tasks like wireless networking, agents are evaluated by average performance over time, both short and longterm effects of actions are crucial, and discounting based s...

متن کامل

Learning Multiple Models for Reward Maximization

We present an approach to reward maximiza-tion in a non-stationary mobile robot environment. The approach works within the realistic constraints of limited local sensing and limited a priori knowledge of the environment. It is based on the use of augmented Markov models (AMMs), a general modeling tool we have developed. AMMs are essentially Markov chains having additional statistics associated ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Philosophical Transactions of the Royal Society B: Biological Sciences

سال: 2014

ISSN: 0962-8436,1471-2970

DOI: 10.1098/rstb.2012.0461